Add make_set function for DisjointSets #38692
This is interesting new functionality. However, why limit it to one extra element?
cdef int *new_parent, *new_rank, *new_mcr, *new_size

cdef int *int_array = <int *> sig_malloc( 4*(n+1) * sizeof(int) )
if int_array is NULL:
Do you really want to call free on NULL?
Ah, sorry, I'll fix that.
new_rank = int_array + (n + 1)
new_mcr = int_array + (2*n + 2)
new_size = int_array + (3 * n + 3)
for i from 0 <= i < n:
for i in range(n)
Will that work in C? If you look at all the other for loops in the .pxd file, they all have the same format:
for i from 0 <= i < n:
Yes, it will work.
K, I've gone ahead and changed it.
For future reference: is there a reason why all the other ones use `for i from 0 <= i < n`? Every other for loop in the file uses this syntax rather than the `for i in range` syntax.
These methods were written a long time ago, and Cython has evolved (improved) since. We have not updated all the code yet; there is too much of it.
src/sage/sets/disjoint_set.pyx
Outdated
@@ -834,6 +854,43 @@ cdef class DisjointSet_of_hashables(DisjointSet_class):
        cdef int j = <int> self._el_to_int[f]
        OP_join(self._nodes, i, j)

    def make_set(self, new_elt = None):
        r"""
        Return a new disjoint set with an additional item.
You are not returning a new set; you are adding one new element to the existing set.
new_size = int_array + (3 * n + 3)
for i from 0 <= i < n:
    new_parent[i] = OP.parent[i]
    new_rank[i] = OP.rank[i]
Why not use the low-level `memcpy` function?
I'm not perfect with C, but if I understand correctly, `memcpy` will also copy the amount of memory allocated for the object. Since we are working with arrays, that would mean the new copy would not have enough memory allocated to it, since we need to add an additional entry to the array.
The first option is to make 4 different mallocs for parent, rank, etc. when first creating the data structure. Then, in `make_set`, you can use `realloc` to extend the size of the arrays. `realloc` should copy the data for you, so you only have to set the last value. In case of error, one of these arrays will get a NULL pointer (unless I'm mistaken). Last, when deallocating, you have to do 4 frees.
The other option is to use `memcpy` instead of the for loop to put the right values in `new_...`. It should be a call like `memcpy(new_parent, OP.parent, n * sizeof(int))`. It remains to assign the last cell of `new_parent`.
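A minimal C sketch of the `memcpy` option, for illustration only. The helper name `copy_and_extend` is hypothetical and not part of Sage; plain `malloc` stands in for `sig_malloc`. The point to notice is that the third argument of `memcpy` is a byte count, so copying `n` ints means passing `n * sizeof(int)`, and that `memcpy` only copies into memory the caller has already allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy the n ints of `old` into a fresh array of n + 1 ints and store
 * `last` in the final cell. Returns NULL if the allocation fails.
 * (Hypothetical helper for illustration, not Sage code.) */
static int *copy_and_extend(const int *old, size_t n, int last)
{
    int *fresh = malloc((n + 1) * sizeof(int));
    if (fresh == NULL)
        return NULL;
    /* memcpy counts bytes, not elements: n ints are n * sizeof(int) bytes. */
    if (n > 0)
        memcpy(fresh, old, n * sizeof(int));
    fresh[n] = last;  /* the one cell memcpy did not fill */
    return fresh;
}
```

The caller owns the returned array and releases it with `free`; the old array is left untouched, so it must be freed separately once it is no longer needed.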
Ok, I'm not as comfortable with C, so I'll see if I can try and use these methods.
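For comparison, the `realloc` option could look roughly like the sketch below, assuming each array is allocated separately. `push_int` is a hypothetical helper name. In standard C, a failed `realloc` returns NULL and leaves the original block allocated, which is why the result goes into a temporary first: assigning it straight to the array pointer would lose the only reference to the old data.

```c
#include <assert.h>
#include <stdlib.h>

/* Grow *arr from n to n + 1 ints and store `last` in the new cell.
 * Returns 0 on success, -1 on failure; on failure *arr is unchanged
 * and still owned by the caller. (Hypothetical helper, not Sage code.) */
static int push_int(int **arr, size_t n, int last)
{
    /* realloc preserves the first n ints; realloc(NULL, ...) acts like malloc. */
    int *tmp = realloc(*arr, (n + 1) * sizeof(int));
    if (tmp == NULL)
        return -1;
    tmp[n] = last;
    *arr = tmp;
    return 0;
}
```

With this layout, `make_set` would call the helper once per array (parent, rank, mcr, size), and deallocation needs one `free` per array.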
From what I understand from reading wikipedia, |
I agree that |
Yes, but it becomes ambiguous once you do that. For example, if I were to do: We could still make a function that does allow making multiple sets (say |
Let's keep this for a future PR. |
Sounds good. I also tried to use the |
Fix memcpy allocation
cdef int *new_parent, *new_rank, *new_mcr, *new_size

cdef int *int_array = <int *> sig_malloc( 4*(n+1) * sizeof(int) )
if int_array is not NULL:
What if `int_array` is `NULL`? You should certainly raise an error.
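In plain C terms, the requested check might be sketched as follows. The struct `arrays4` and the function `alloc_arrays` are hypothetical names; the Cython code in this PR would raise `MemoryError` where this sketch returns -1. The failure is reported before any of the four pointers is touched.

```c
#include <assert.h>
#include <stdlib.h>

/* Four arrays that share one allocation (hypothetical type). */
typedef struct {
    int *parent, *rank, *mcr, *size;
} arrays4;

/* Allocate one block of 4 * m ints (m > 0) and carve it into four arrays
 * of m ints each. Returns 0 on success, -1 if the allocation fails, in
 * which case *out is left untouched. */
static int alloc_arrays(arrays4 *out, size_t m)
{
    int *block = malloc(4 * m * sizeof(int));
    if (block == NULL)
        return -1;  /* report the failure; there is nothing to free */
    out->parent = block;
    out->rank = block + m;
    out->mcr = block + 2 * m;
    out->size = block + 3 * m;
    return 0;
}
```

Returning early on failure also means the success path needs no `else` branch.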
new_size[n] = 1

OP.parent = new_parent
OP.rank = new_rank
Please avoid the extra spaces before `=`. This is not compatible with Python coding style.
INPUT:

- ``new_elt`` -- (optional) element to add. If `None`, then an integer
For the second line, you must add 2 extra spaces, like:
- ``new_elt`` -- (optional) element to add. If `None`, then an integer
  is added.
This currently breaks building the PDF documentation (reported by CI).
src/sage/sets/disjoint_set.pyx
Outdated
@@ -834,6 +854,43 @@ cdef class DisjointSet_of_hashables(DisjointSet_class):
        cdef int j = <int> self._el_to_int[f]
        OP_join(self._nodes, i, j)

    def make_set(self, new_elt = None):
Don't put spaces around `=` here. It should be `def make_set(self, new_elt=None):`
cdef int *int_array = <int *> sig_malloc(4*(n+1) * sizeof(int))
if int_array is NULL:
    raise MemoryError("MemoryError allocating int_array in make_set method")
else:
Actually, you can remove the `else` and reduce the indentation of the code. Indeed, the code will be executed only if `int_array` is not `NULL`.
That's a good point. I'll switch it over.
Documentation preview for this PR (built with commit 598ccfd; changes) is ready! 🎉
new_mcr[n] = n
new_size[n] = 1

OP.parent = new_parent
Before this statement, you must release the memory used by the previous arrays, so `sig_free(OP.parent)`.
Done. I'm not so good with memory allocation, but I think we only need `sig_free(OP.parent)`, correct? I tried adding the other ones and it started throwing errors, so I'm thinking that `OP.parent` holds the entire memory for all 4 variables, so freeing that one pointer frees the memory for all 4. If I did that wrong, just let me know.
LGTM.
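The reasoning in this thread can be checked with a small C model, sketched here under stated assumptions. The `partition` struct and its functions are hypothetical stand-ins for the real data structure, and plain `malloc`/`free` stand in for `sig_malloc`/`sig_free`. The values written for the new `mcr` and `size` cells follow the diff above; the values for the new `parent` and `rank` cells (own root, rank 0) are assumed from the usual union-find convention. Because all four arrays are slices of one allocation, only the base pointer (`parent`) came from `malloc`, so it is the only one that may be passed to `free`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the partition struct discussed in the review. */
typedef struct {
    int degree;                       /* number of elements */
    int *parent, *rank, *mcr, *size;  /* parent is the base of one block */
} partition;

/* Point the four arrays at consecutive slices of `block`, m ints each. */
static void carve(partition *p, int *block, int m)
{
    p->parent = block;
    p->rank = block + m;
    p->mcr = block + 2 * m;
    p->size = block + 3 * m;
    p->degree = m;
}

/* Create n singleton sets (n > 0). Returns 0 on success, -1 on failure. */
static int partition_init(partition *p, int n)
{
    int *block = malloc(4 * (size_t)n * sizeof(int));
    if (block == NULL)
        return -1;
    carve(p, block, n);
    for (int i = 0; i < n; i++) {
        p->parent[i] = i;
        p->rank[i] = 0;
        p->mcr[i] = i;
        p->size[i] = 1;
    }
    return 0;
}

/* Append one singleton set in place. Returns 0 on success, -1 on failure,
 * in which case the partition is unchanged. */
static int partition_make_set(partition *p)
{
    int n = p->degree, m = n + 1;
    int *block = malloc(4 * (size_t)m * sizeof(int));
    if (block == NULL)
        return -1;
    size_t bytes = (size_t)n * sizeof(int);
    memcpy(block, p->parent, bytes);
    memcpy(block + m, p->rank, bytes);
    memcpy(block + 2 * m, p->mcr, bytes);
    memcpy(block + 3 * m, p->size, bytes);
    int *old = p->parent;  /* base pointer of the old block */
    carve(p, block, m);
    p->parent[n] = n;      /* assumed: the new element is its own root */
    p->rank[n] = 0;        /* assumed: a singleton has rank 0 */
    p->mcr[n] = n;
    p->size[n] = 1;
    free(old);             /* one free releases all four old arrays */
    return 0;
}

/* Release the single block that backs all four arrays. */
static void partition_free(partition *p)
{
    free(p->parent);
}
```

Calling `free` on `rank`, `mcr`, or `size` would pass a pointer into the middle of the block, which is undefined behaviour in C and is consistent with the errors described above.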
This fixes sagemath#35599 by adding a `make_set` function to `DisjointSet` using `OrbitPartitions`. The documentation links to Wikipedia. According to Wikipedia, the operation should be done in place, so the method returns nothing.

### 📝 Checklist

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [x] I have updated the documentation and checked the documentation preview.

### ⌛ Dependencies

sagemath#35599

URL: sagemath#38692
Reported by: Aram Dermenjian
Reviewer(s): Aram Dermenjian, David Coudert